March 11: I was worried

March 11: Very worried

March 15: A bit less worried

March 25: Holy sh*t!

58,000 = 2% of PR population!

Mathematical Models versus Statistical Models versus EDA

  • Mathematical: make predictions based on differential equations designed to describe how nature works

  • Statistical: fit models to observed data. We tend to limit ourselves to a few models that have worked well in other scenarios.

  • Visualization: fewer assumptions, but lets us see patterns.
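
As a minimal sketch of the mathematical approach, assuming the classic SIR compartmental model (the slides do not say which equations were used), the code below integrates the differential equations with simple Euler steps. The function name `simulate_sir`, its arguments, and all parameter values are hypothetical and chosen only for illustration.

```r
# Hypothetical sketch: SIR model integrated with Euler steps.
# beta (transmission rate) and gamma (recovery rate) are illustrative, not estimates.
simulate_sir <- function(n, i0, beta, gamma, days, dt = 0.1) {
  steps <- round(days / dt)
  s <- numeric(steps + 1)
  i <- numeric(steps + 1)
  r <- numeric(steps + 1)
  s[1] <- n - i0
  i[1] <- i0
  for (k in seq_len(steps)) {
    new_inf <- beta * s[k] * i[k] / n * dt  # from dS/dt = -beta * S * I / N
    new_rec <- gamma * i[k] * dt            # from dR/dt = gamma * I
    s[k + 1] <- s[k] - new_inf
    i[k + 1] <- i[k] + new_inf - new_rec
    r[k + 1] <- r[k] + new_rec
  }
  data.frame(day = (0:steps) * dt, S = s, I = i, R = r)
}

sir <- simulate_sir(n = 1e6, i0 = 10, beta = 0.4, gamma = 0.1, days = 200)
head(sir)
```

Plotting `sir$I` against `sir$day` gives the kind of epidemic curve such a model predicts. The result depends entirely on the assumed equations and parameters.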

Italy and Spain up to March 9

Italy and Spain up to March 25

Italy and Spain up to today

New York and MA up to March 25

Puerto Rico and Brazil up to March 25

Can we predict based on data?

“It’s hard to make predictions especially about the future” - Yogi Berra

“Forecasting s-curves is hard” - Constance Crozier

Prediction is hard

Here is what ended up happening

Positivity Rate

  • We know transmission is high
  • Lockdowns can control it
  • Perhaps the most important number is the number of infected \(I(t)\)
  • How do we estimate it?
  • The number of reported cases is actually a bad estimate

Cases in Puerto Rico

Source: Puerto Rico Institute of Statistics

Fewer cases on weekends?

Is it really growing this much in the US?

The number of tests obviously affects this

Positivity rate takes this into account

Note that this rate is not growing in all US states

Positivity rate

Tries to estimate prevalence

Positivity rate

If we only test symptomatic people, this estimate will be biased

Positivity rate

If we do more universal tests, the estimate is less biased.
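
A small worked example may help show the size of this bias. It assumes a perfect test and uses made-up values for prevalence and symptom rates. The variable names are hypothetical.

```r
# Hypothetical values, chosen only for illustration.
prevalence   <- 0.02  # true fraction of the population infected
p_sym_inf    <- 0.60  # fraction of infected people with symptoms
p_sym_noninf <- 0.05  # fraction of non-infected people with similar symptoms

# Positivity rate if only symptomatic people are tested
positivity_symptomatic <- prevalence * p_sym_inf /
  (prevalence * p_sym_inf + (1 - prevalence) * p_sym_noninf)

# Positivity rate if people are tested regardless of symptoms
positivity_universal <- prevalence

round(c(symptomatic_only = positivity_symptomatic,
        universal        = positivity_universal), 3)
```

Under these assumptions the symptomatic-only positivity rate is roughly ten times the true prevalence, while universal testing recovers the prevalence itself.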

Positivity rate

But cases will grow with the number of tests regardless of prevalence.
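
To illustrate the point, the sketch below holds the positivity rate fixed and doubles the number of tests each week. All numbers are made up.

```r
# Hypothetical illustration: positivity held constant while testing expands.
tests      <- c(1000, 2000, 4000, 8000)  # tests performed per week (made up)
positivity <- 0.05                       # constant fraction of positive tests
cases      <- tests * positivity         # reported cases

data.frame(week = 1:4, tests, cases, positivity_rate = cases / tests)
```

Reported cases grow eightfold here even though the positivity rate never moves, which is why case counts need to be read alongside the number of tests.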

Problem in Puerto Rico

We did not know how many tests were being performed.

Massachusetts Dashboard

Positivity rate is their main indicator

Massachusetts Dashboard

They also look at hospitalizations and deaths

Here is what the JHU dashboard had about PR

Excess deaths

  • Were we underestimating the pandemic?
  • Given the lack of reliable COVID-19 data, we can look at excess deaths

Excess deaths in the US

Excess Deaths in PR

Deaths in March 2019: 2489

Deaths in March 2020: 2720

Estimating excess deaths in PR is complicated because:

  • There is natural variability (not just in PR)
  • Population size changes
  • Demographics change

Incomplete data

Another problem is incomplete data

Population is decreasing and becoming older

March in other years

We can’t just compare two years
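
One way to go beyond a two-year comparison is to compare the observed count with the same month in several earlier years. The sketch below is an assumption about how that could look, not the analysis behind the slides. The helper `excess_summary` is hypothetical, and the baseline counts are MADE UP to show the call. Only the 2019 and 2020 counts come from the slides.

```r
# Deaths in March, from the slides
deaths_2019 <- 2489
deaths_2020 <- 2720

# Naive two-year comparison
deaths_2020 - deaths_2019  # 231 more deaths

# Hypothetical helper: compare against several earlier years instead of one.
# `baseline` holds counts for the same month in previous years.
excess_summary <- function(observed, baseline) {
  expected <- mean(baseline)
  c(expected = expected,
    excess   = observed - expected,
    z_score  = (observed - expected) / sd(baseline))
}

# MADE-UP baseline counts, for illustration only; use the real registry counts.
excess_summary(deaths_2020, baseline = c(2400, 2450, 2500, 2550, 2600))
```

This sketch only addresses natural variability. It does not adjust for a shrinking or aging population, which would require rates by age group rather than raw counts.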

In June we did not see much reason for concern regarding COVID-19

  • But this could change rapidly

  • And will we know soon enough?

Typical years

María

Georges

COVID-19

Chikungunya

We couldn’t calculate the positivity rate

We get data from the PRHST

But

All we want is a table
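
As a sketch of what such a table could look like, assume one row per day with the number of tests performed and the number of positive results. The column names and all values below are made up. With that table, the positivity rate is one division, and aggregating by week keeps low weekend volume from distorting it.

```r
# Hypothetical daily table: all numbers are made up, column names are assumptions.
daily <- data.frame(
  date      = as.Date("2020-06-01") + 0:13,
  tests     = c(900, 950, 1000, 980, 940, 300, 250,
                1100, 1150, 1200, 1180, 1120, 350, 300),
  positives = c(27, 29, 30, 29, 28, 9, 8,
                55, 58, 60, 59, 56, 18, 15)
)

# Group days into weeks starting on Monday
daily$week <- cut(daily$date, breaks = "week")

# Sum tests and positives within each week, then compute the rate
weekly <- aggregate(cbind(tests, positives) ~ week, data = daily, FUN = sum)
weekly$positivity_rate <- weekly$positives / weekly$tests
weekly
```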

We start noticing an increase

Increase observed at end of June

Meanwhile in Florida, from which 27 flights arrive

We also start seeing an increase in hospitalizations

Health Department shares an API

July 6: I have a first version with Health Department data, but can't share it

We are above 5%

Health Department shares an API

Once we have data in a nice form, showing data is easy

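library(readxl)     # provides read_xlsx()
library(tidyverse)  # provides %>% and ggplot2
# Read the hospitalization table, then plot daily totals with a smoothed trend
hosp <- read_xlsx("data/Dash Total.xlsx")
hosp %>%
  ggplot(aes(Fecha, `Total de Personas Hospitalizadas COVID`)) +
  geom_point() + geom_smooth(span = 0.3)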

We build a dashboard

Plots show up in governor’s press conference

Here is what ended up happening

Recommendations

  • Organize data systematically
  • Monitor with visualizations: positivity rate, hospitalizations, deaths
  • Do it by geographical regions
  • Data and code: https://github.com/rafalab/pr-covid

Acknowledgements

  • Rolando Acosta (Harvard)
  • Joshua Villafañe Delgado (Departamento de Salud)
  • Danilo Trinidad Perez Rivera (Departamento de Salud)
  • José A. López Rodriguez (Registro Demográfico)
  • Wanda Llovet Díaz (Registro Demográfico)
  • Marcos López Casillas (PRST)
  • José Rodriguez Orengo (PRST)
  • Daniel Colón Ramos (Yale)
  • Caroline Buckee, Michael Mina, Marc Lipsitch (Harvard)
  • Natalie Dean (University of Florida)